author:
  - Yossi Gandelsman
  - Yu Sun
  - Xinlei Chen
  - Alexei Efros
submission:
  - Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
year: "2022"
file:
  - "[[learning_methods/test_time/paper/03. Test-Time Training with Masked Autoencoders.pdf|03. Test-Time Training with Masked Autoencoders]]"
related: 
tags:
  - Test-Time-Training
review date: 2024-11-12

Summary

1-Line

Test-Time Training (TTT)을 Masked Autoencoder(MAE)와 결합하여 distribution shift 문제를 해결하는 방법을 제안 [Abstract, p.1]

3-Line

기존 TTT는 rotation prediction을 self-supervision task로 사용했으나, 이는 일반화가 어려운 한계가 있음 [Section 2.2, p.2-3]

MAE를 self-supervision task로 활용하여 TTT를 수행함으로써 test distribution shift에 더 잘 적응할 수 있음 [Section 1 & Section 3, p.1, p.3-4]

Linear model 분석을 통해 TTT가 bias-variance tradeoff 측면에서 효과적임을 이론적으로 증명 [Section 5, p.8-9]

5-Line

Distribution shift 문제를 해결하기 위한 Test-Time Training(TTT) 방법론 제안

기존 TTT의 rotation prediction task 대신 MAE를 self-supervision task로 활용

MAE와 ViT 기반의 Y-shaped 아키텍처를 통해 feature extraction 수행

ImageNet-C, ImageNet-A, ImageNet-R 등 다양한 distribution shift 벤치마크에서 성능 향상 확인

Linear model 분석을 통해 TTT의 이론적 근거 제시

Paper Review

Abstract / Motivation

기존 TTT 방법의 rotation prediction task는 일반화가 어려운 한계가 있음 [Section 2.2, p.2]
MAE를 self-supervision task로 활용하여 더 나은 generalization 성능을 얻고자 함 [Section 1 Introduction, p.1]

Background

기존 TTT는 rotation prediction을 self-supervision task로 사용 [Section 2.2, p.2-3]

MAE는 spatial redundancy를 활용한 self-supervised learning 방법 [Section 2.3 "Self-Supervision by Spatial Autoencoding", p.3]

Targeting Problems

Distribution shift로 인한 generalization 성능 저하 [Section 1 Introduction, p.1]
Rotation prediction의 task 한계성 [Section 2.2, p.2]:

Natural outdoor scenes에서는 horizon 검출로 인해 너무 쉬운 문제가 됨
Top-down view에서는 rotation이 모호해져 너무 어려운 문제가 됨

Suggestions / Methods

Architecture [Section 3 Method, p.3-4]:

Y-shaped architecture 사용
MAE encoder를 feature extractor로 활용
Self-supervised head(MAE decoder)와 main task head(ViT) 결합

Training [Section 3 Method & Section 4.1 Implementation Details, p.3-5]:

MAE pre-trained model 활용
ViT probing으로 main task head 학습
Test time에 SGD optimizer 사용하여 adaptation 수행

Improvements [Section 4.2, 4.3 & Section 5, p.5-9]:

다양한 corruption type에서 더 나은 성능 달성
Rotation invariant class에서도 성능 향상
Linear model 분석을 통한 이론적 근거 제시